{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "t09eeeR5prIJ"
      },
      "source": [
        "##### Copyright 2018 The TensorFlow Authors."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "cellView": "form",
        "id": "GCCk8_dHpuNf"
      },
      "outputs": [],
      "source": [
        "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n",
        "# you may not use this file except in compliance with the License.\n",
        "# You may obtain a copy of the License at\n",
        "#\n",
        "# https://www.apache.org/licenses/LICENSE-2.0\n",
        "#\n",
        "# Unless required by applicable law or agreed to in writing, software\n",
        "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
        "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
        "# See the License for the specific language governing permissions and\n",
        "# limitations under the License."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "xh8WkEwWpnm7"
      },
      "source": [
        "# Automatic differentiation and gradient tape"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "idv0bPeCp325"
      },
      "source": [
        "<table class=\"tfo-notebook-buttons\" align=\"left\">\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/r1/tutorials/eager/automatic_differentiation.ipynb\"><img src=\"https://www.tensorflow.org/images/colab_logo_32px.png\" />Run in Google Colab</a>\n",
        "  </td>\n",
        "  <td>\n",
        "    <a target=\"_blank\" href=\"https://github.com/tensorflow/docs/blob/master/site/en/r1/tutorials/eager/automatic_differentiation.ipynb\"><img src=\"https://www.tensorflow.org/images/GitHub-Mark-32px.png\" />View source on GitHub</a>\n",
        "  </td>\n",
        "</table>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "idv0bPeCp325"
      },
      "source": [
        "> Note: This is an archived TF1 notebook. These are configured\n",
        "to run in TF2's \n",
        "[compatibility mode](https://www.tensorflow.org/guide/migrate)\n",
        "but will run in TF1 as well. To use TF1 in Colab, use the\n",
        "[%tensorflow_version 1.x](https://colab.research.google.com/notebooks/tensorflow_version.ipynb)\n",
        "magic."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "vDJ4XzMqodTy"
      },
      "source": [
        "In the previous tutorial we introduced `Tensor`s and operations on them. In this tutorial we will cover [automatic differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation), a key technique for optimizing machine learning models."
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "GQJysDM__Qb0"
      },
      "source": [
        "## Setup\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "OiMPZStlibBv"
      },
      "outputs": [],
      "source": [
        "import tensorflow.compat.v1 as tf"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "1CLWJl0QliB0"
      },
      "source": [
        "## Gradient tapes\n",
        "\n",
        "TensorFlow provides the [tf.GradientTape](https://www.tensorflow.org/api_docs/python/tf/GradientTape) API for automatic differentiation - computing the gradient of a computation with respect to its input variables. Tensorflow \"records\" all operations executed inside the context of a `tf.GradientTape` onto a \"tape\". Tensorflow then uses that tape and the gradients associated with each recorded operation to compute the gradients of a \"recorded\" computation using [reverse mode differentiation](https://en.wikipedia.org/wiki/Automatic_differentiation).\n",
        "\n",
        "For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "bAFeIE8EuVIq"
      },
      "outputs": [],
      "source": [
        "x = tf.ones((2, 2))\n",
        "\n",
        "with tf.GradientTape() as t:\n",
        "  t.watch(x)\n",
        "  y = tf.reduce_sum(x)\n",
        "  z = tf.multiply(y, y)\n",
        "\n",
        "# Derivative of z with respect to the original input tensor x\n",
        "dz_dx = t.gradient(z, x)\n",
        "for i in [0, 1]:\n",
        "  for j in [0, 1]:\n",
        "    assert dz_dx[i][j].numpy() == 8.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "N4VlqKFzzGaC"
      },
      "source": [
        "You can also request gradients of the output with respect to intermediate values computed during a \"recorded\" `tf.GradientTape` context."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "7XaPRAwUyYms"
      },
      "outputs": [],
      "source": [
        "x = tf.ones((2, 2))\n",
        "\n",
        "with tf.GradientTape() as t:\n",
        "  t.watch(x)\n",
        "  y = tf.reduce_sum(x)\n",
        "  z = tf.multiply(y, y)\n",
        "\n",
        "# Use the tape to compute the derivative of z with respect to the\n",
        "# intermediate value y.\n",
        "dz_dy = t.gradient(z, y)\n",
        "assert dz_dy.numpy() == 8.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ISkXuY7YzIcS"
      },
      "source": [
        "By default, the resources held by a GradientTape are released as soon as GradientTape.gradient() method is called. To compute multiple gradients over the same computation, create a `persistent` gradient tape. This allows multiple calls to the `gradient()` method as resources are released when the tape object is garbage collected. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "zZaCm3-9zVCi"
      },
      "outputs": [],
      "source": [
        "x = tf.constant(3.0)\n",
        "with tf.GradientTape(persistent=True) as t:\n",
        "  t.watch(x)\n",
        "  y = x * x\n",
        "  z = y * y\n",
        "dz_dx = t.gradient(z, x)  # 108.0 (4*x^3 at x = 3)\n",
        "dy_dx = t.gradient(y, x)  # 6.0\n",
        "del t  # Drop the reference to the tape"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "6kADybtQzYj4"
      },
      "source": [
        "### Recording control flow\n",
        "\n",
        "Because tapes record operations as they are executed, Python control flow (using `if`s and `while`s for example) is naturally handled:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "9FViq92UX7P8"
      },
      "outputs": [],
      "source": [
        "def f(x, y):\n",
        "  output = 1.0\n",
        "  for i in range(y):\n",
        "    if i > 1 and i < 5:\n",
        "      output = tf.multiply(output, x)\n",
        "  return output\n",
        "\n",
        "def grad(x, y):\n",
        "  with tf.GradientTape() as t:\n",
        "    t.watch(x)\n",
        "    out = f(x, y)\n",
        "  return t.gradient(out, x)\n",
        "\n",
        "x = tf.convert_to_tensor(2.0)\n",
        "\n",
        "assert grad(x, 6).numpy() == 12.0\n",
        "assert grad(x, 5).numpy() == 12.0\n",
        "assert grad(x, 4).numpy() == 4.0\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "DK05KXrAAld3"
      },
      "source": [
        "### Higher-order gradients\n",
        "\n",
        "Operations inside of the `GradientTape` context manager are recorded for automatic differentiation. If gradients are computed in that context, then the gradient computation is recorded as well. As a result, the exact same API works for higher-order gradients as well. For example:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {
        "id": "cPQgthZ7ugRJ"
      },
      "outputs": [],
      "source": [
        "x = tf.Variable(1.0)  # Create a Tensorflow variable initialized to 1.0\n",
        "\n",
        "with tf.GradientTape() as t:\n",
        "  with tf.GradientTape() as t2:\n",
        "    y = x * x * x\n",
        "  # Compute the gradient inside the 't' context manager\n",
        "  # which means the gradient computation is differentiable as well.\n",
        "  dy_dx = t2.gradient(y, x)\n",
        "d2y_dx2 = t.gradient(dy_dx, x)\n",
        "\n",
        "assert dy_dx.numpy() == 3.0\n",
        "assert d2y_dx2.numpy() == 6.0"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "4U1KKzUpNl58"
      },
      "source": [
        "## Next Steps\n",
        "\n",
        "In this tutorial we covered gradient computation in TensorFlow. With that we have enough of the primitives required to build and train neural networks."
      ]
    }
  ],
  "metadata": {
    "colab": {
      "collapsed_sections": [],
      "name": "automatic_differentiation.ipynb",
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
